Skip to content

Conversation

@fujitatomoya
Copy link
Collaborator

Description

Fixes #1097

Possible fix for #1088, at least my local environment test results are all green and stable.

Is this user-facing behavior change?

No, only test cleanup process is fixed.

Did you use Generative AI?

No

Additional Information

This PR includes #1096

Comment on lines +68 to +78
RegisterEventHandler(OnShutdown(on_shutdown=[
# Stop daemon in isolated environment with proper ROS_DOMAIN_ID
ExecuteProcess(
cmd=['ros2', 'daemon', 'stop'],
name='daemon-stop-isolated',
# Use the same isolated environment
additional_env=dict(additional_env),
),
# This must be done after stopping the daemon in the isolated environment
ResetEnvironment(),
])),
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is the fix to avoid leaving the ros2 daemon processes with different ROS_DOMAIN_ID.
Right after the test, it makes sure clean up the ros2 daemon process in isolated environment, in this way we can only shut down the ros2 daemon instantiated by this isolated environment.
ros2 daemon stop above can stay there to stop the ros2 daemon with default ROS_DOMAIN_ID if that is online.

Comment on lines -133 to -137
# TODO(@fujitatomoya): rmw_zenoh_cpp is instable to find the endpoints, it does not
# matter if DaemonNode or DirectNode is used. For now, skip the test for rmw_zenoh_cpp.
if self.rmw_implementation == 'rmw_zenoh_cpp':
raise unittest.SkipTest()

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is another fix to support zenoh here. It now uses rmw isolated environment for the test, that will call rmw_zenoh specific setup for the isolation.

@fujitatomoya fujitatomoya requested a review from Copilot September 4, 2025 13:29
@fujitatomoya
Copy link
Collaborator Author

Pulls: #1098
Gist: https://gist.githubusercontent.com/fujitatomoya/eb78260e2f0cfbf35b13ff15cf65ed17/raw/b26f8d55728276d24e22769744150c82c4e491a8/ros2.repos
BUILD args: --packages-up-to ros2action ros2topic ros2node ros2param ros2service ros2lifecycle ros2doctor
TEST args: --packages-select ros2action ros2topic ros2node ros2param ros2service ros2lifecycle ros2doctor
ROS Distro: rolling
Job: ci_launcher
ci_launcher ran: https://ci.ros2.org/job/ci_launcher/16854

  • Linux Build Status
  • Linux-aarch64 Build Status
  • Linux-rhel Build Status
  • Windows Build Status

@mjcarroll
Copy link
Member

Possible fix for #1088, at least my local environment test results are all green and stable.

While I'm hopeful this fixes it, I'm all green on my local build as well :(

Copy link
Member

@mjcarroll mjcarroll left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, thanks for tackling this @fujitatomoya, I will let @cottsay also give it a look

@fujitatomoya
Copy link
Collaborator Author

While I'm hopeful this fixes it, I'm all green on my local build as well :(

maybe port assignment depending on the test environment...not so sure... 😓 fingers crossed 🤞

Copy link

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copilot encountered an error and was unable to review this pull request. You can try again by re-requesting a review.

Copy link
Contributor

@ahcorde ahcorde left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you!

Copy link
Member

@cottsay cottsay left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the PR. This is probably a good idea, and is well-implemented.

At large, this is even more evidence that we need a proper isolation mechanism for the daemon itself, but that's well outside the scope of this change.

@fujitatomoya
Copy link
Collaborator Author

all green, i will go ahead to merge this.

@fujitatomoya fujitatomoya merged commit f350134 into rolling Sep 4, 2025
2 of 3 checks passed
@fujitatomoya
Copy link
Collaborator Author

We don't need to backport this fix, EnableRmwIsolation is only available in rolling.

@cottsay
Copy link
Member

cottsay commented Sep 4, 2025

We don't need to backport this fix, EnableRmwIsolation is only available in rolling.

I do plan to backport that stuff to Kilted when we stabilize the tests on Rolling.

Blast545 pushed a commit that referenced this pull request Sep 29, 2025
* Enable RMW isolation for ros2doctor.test_report.

Signed-off-by: Tomoya Fujita <[email protected]>

* ResetEnvironment on shutdown is missing.

Signed-off-by: Tomoya Fujita <[email protected]>

* Clean up isolated ros2 daemon process when tests complete.

Signed-off-by: Tomoya Fujita <[email protected]>

* Clean up isolated ros2 daemon process for ros2topic.

Signed-off-by: Tomoya Fujita <[email protected]>

* Clean up isolated ros2 daemon process for ros2action.

Signed-off-by: Tomoya Fujita <[email protected]>

* Clean up isolated ros2 daemon process for ros2doctor.

Signed-off-by: Tomoya Fujita <[email protected]>

* Clean up isolated ros2 daemon process for ros2lifecycle and ros2node.

Signed-off-by: Tomoya Fujita <[email protected]>

* Clean up isolated ros2 daemon process for ros2param.

Signed-off-by: Tomoya Fujita <[email protected]>

---------

Signed-off-by: Tomoya Fujita <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

bug Something isn't working

Projects

None yet

Development

Successfully merging this pull request may close these issues.

colcon test leaves zombie daemon process with different ROS_DOMAIN_ID

5 participants